
31 research outputs found

LIMIX: genetic analysis of multiple traits

Author: Casale F.P.
Lippert C.
Rakitsch B.
Stegle O.
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 22/05/2014

Multi-trait mixed models have emerged as a promising approach for joint analyses of multiple traits. In principle, the mixed model framework is remarkably general. However, current methods implement only a very specific range of tasks to optimize the necessary computations. Here, we present a multi-trait modeling framework that is versatile and fast: LIMIX enables users to flexibly adapt mixed models for a broad range of applications with different observed and hidden covariates, and variable study designs. To highlight the novel modeling aspects of LIMIX, we performed three vastly different genetic studies: joint GWAS of correlated blood lipid phenotypes, joint analysis of the expression levels of the multiple transcript-isoforms of a gene, and pathway-based modeling of molecular traits across environments. In these applications we show that LIMIX increases GWAS power and phenotype prediction accuracy, in particular when integrating stepwise multi-locus regression into multi-trait models, and when analyzing large numbers of traits. An open source implementation of LIMIX is freely available at: https://github.com/PMBio/limix

MDC Repository

Expression QTLs Mapping and Analysis: A Bayesian Perspective.

The aim of expression Quantitative Trait Locus (eQTL) mapping is the identification of DNA sequence variants that explain variation in gene expression. Given the recent yield of trait-associated genetic variants identified by large-scale genome-wide association studies (GWAS), eQTL mapping has become a useful tool to understand the functional context in which these variants operate and eventually narrow down functional gene targets for disease. Despite its extensive application to complex (polygenic) traits and disease, the majority of eQTL studies still rely on univariate data modeling strategies, i.e., testing for association of all transcript-marker pairs. However, these "one-at-a-time" strategies are (1) unable to control the number of false positives when an intricate Linkage Disequilibrium structure is present and (2) often underpowered to detect the full spectrum of trans-acting regulatory effects. Here we present our viewpoint on the most recent advances in eQTL mapping approaches, with a focus on Bayesian methodology. We review the advantages of the Bayesian approach over frequentist methods and provide an empirical example of polygenic eQTL mapping to illustrate the different properties of frequentist and Bayesian methods. Finally, we discuss how multivariate eQTL mapping approaches have distinctive features with respect to detection of polygenic effects, accuracy, and interpretability of the results.

Crossref

Apollo (Cambridge)

Genetic variants and their interactions in disease risk prediction – machine learning and network perspectives

Author: 1000 Genomes Project
A Ashworth
A Burga
A Califano
A Galvan
A Gyenesei
A Statnikov
A Torkamani
A Torkamani
AL Barabási
AL Hopkins
B Lehner
B Lehner
B Maher
B Rakitsch
BA McKinney
BA McKinney
BS Srinivasan
C Ambroise
C Kooperberg
C Tian
C Winter
CG Lambert
CS Greene
D Merico
D Urbach
DJ Balding
DM Evans
DW Aha
DW Huang
DW Huang
E Lee
EA Ashley
EE Eichler
EE Schadt
ES Lander
F Barrenäs
G Bebek
G Gibson
G Hannum
G Peng
GK Chen
GM Clarke
H Eleftherohorinou
H Holm
H Zhong
HJ Cordell
HY Chuang
I Feldman
I Guyon
I König
I Surakka
J Corander
J Jakobsdottir
J Kruppa
J Tuikkala
J Yang
JD Iglehart
JH Moore
JH Moore
K Askland
K Wang
KA Pattin
KS Reynolds
L Luo
M Ladouceur
M Michaut
M Mooney
M Smoot
M Vidal
MA Heiskanen
MD Ritchie
MJ Sillanpää
NA Lavender
NF Marko
O Lavi
O Zuk
P Beltrao
P Donnelly
P Kraft
P Sebastiani
P Smialowski
PC Phillips
PJ Castaldi
Q He
R Braun
R Jelier
R Makowsky
R Simon
RO Lindén
S Lee
S Okser
S Ripatti
S Varma
SE Baranzini
Sebastian Okser
SJ Dixon
SW Hartley
T Hu
T Ideker
T Pahikkala
T Peltola
T Schupbach
TA Manolio
Tapio Pahikkala
Tero Aittokallio
TS Deisboeck
TT Wu
U Ober
U Ober
V Bansal
VK Ramanan
W Huang
Wellcome Trust Case Control Consortium
WG Kaelin Jr
Y Saeys
Z Wang
Z Wei
Publication venue: 'Springer Science and Business Media LLC'

Crossref

It is all in the noise: Efficient multi-task Gaussian process inference with structured residuals

Author: Borgwardt K.
Lippert C.
Rakitsch B.
Stegle O.
Publication date: 01/01/2013

Multi-task prediction methods are widely used to couple regressors or classification models by sharing information across related tasks. We propose a multi-task Gaussian process approach for modeling both the relatedness between regressors and the task correlations in the residuals, in order to more accurately identify true sharing between regressors. The resulting Gaussian model has a covariance term in the form of a sum of Kronecker products, for which efficient parameter inference and out-of-sample prediction are feasible. On both synthetic examples and applications to phenotype prediction in genetics, we find substantial benefits of modeling structured noise compared to established alternatives.
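To illustrate why a sum of Kronecker products can be handled efficiently, here is a minimal NumPy sketch. It assumes a covariance of the form C ⊗ R + Σ ⊗ I, with a task covariance C, a residual task covariance Σ, and a sample covariance R. This decomposition, the function name `kron_sum_logdet`, and the argument names are illustrative assumptions, not taken from the paper or its code. The sketch computes the log-determinant from small eigendecompositions instead of forming the full matrix.

```python
import numpy as np

def kron_sum_logdet(C, Sigma, R):
    """Log-determinant of kron(C, R) + kron(Sigma, I_n) without building it.

    Assumes C and Sigma are T x T, R is n x n, all symmetric,
    with Sigma positive definite and C, R positive semi-definite.
    """
    n = R.shape[0]
    # Eigendecomposition of the residual task covariance: Sigma = U S U^T
    s_sig, U_sig = np.linalg.eigh(Sigma)
    # A_inv = S^{-1/2} U^T whitens the residual task covariance
    A_inv = (U_sig / np.sqrt(s_sig)).T
    # Task covariance expressed in the whitened basis
    C_tilde = A_inv @ C @ A_inv.T
    s_c = np.linalg.eigvalsh(C_tilde)
    s_r = np.linalg.eigvalsh(R)
    # log|K| = n * log|Sigma| + sum_ij log(1 + s_c[i] * s_r[j])
    return n * np.sum(np.log(s_sig)) + np.sum(np.log1p(np.outer(s_c, s_r)))
```

For T tasks and n samples, the cost is dominated by one T x T and one n x n eigendecomposition, rather than a factorization of the full nT x nT matrix.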

MDC Repository

MPG.PuRe

A Lasso multi-marker mixed model for association mapping with population structure correction

Author: Borgwardt K.
Lippert C.
Rakitsch B.
Stegle O.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2013

Motivation: Exploring the genetic basis of heritable traits remains one of the central challenges in biomedical research. In traits with simple Mendelian architectures, single polymorphic loci explain a significant fraction of the phenotypic variability. However, many traits of interest seem to be subject to multifactorial control by groups of genetic loci. Accurate detection of such multivariate associations is non-trivial and often compromised by limited statistical power. At the same time, confounding influences, such as population structure, cause spurious association signals that result in false-positive findings. Results: We propose LMM-Lasso, a linear mixed model that allows for both multi-locus mapping and correction for confounding effects. Our approach is simple and free of tuning parameters; it effectively controls for population structure and scales to genome-wide datasets. LMM-Lasso simultaneously discovers likely causal variants and allows for multi-marker-based phenotype prediction from genotype. We demonstrate the practical use of LMM-Lasso in genome-wide association studies in Arabidopsis thaliana and linkage mapping in mouse, where our method achieves significantly more accurate phenotype prediction for 91% of the considered phenotypes. At the same time, our model dissects the phenotypic variability into components that result from individual single nucleotide polymorphism effects and population structure. Enrichment of known candidate genes suggests that the individual associations retrieved by LMM-Lasso are likely to be genuine. Availability: Code available under http://webdav.tuebingen.mpg.de/u/karsten/Forschung/research.html
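One common way to combine a Lasso with a mixed-model correction is to rotate the data so that the confounding covariance becomes the identity, then run an ordinary Lasso on the rotated data. The sketch below illustrates that general idea only. It is not the authors' implementation: the function names, the fixed noise-to-signal ratio `delta`, the penalty `lam`, and the plain coordinate-descent solver are all assumptions made for illustration, and the abstract does not describe how these quantities are chosen.

```python
import numpy as np

def whiten(K, delta):
    """Return W such that W (K + delta*I) W^T = I, for symmetric PSD K."""
    s, U = np.linalg.eigh(K)
    return (U / np.sqrt(s + delta)).T  # W = (S + delta*I)^{-1/2} U^T

def lasso_cd(X, y, lam, n_iter=200):
    """Coordinate descent for 0.5 * ||y - X b||^2 + lam * ||b||_1."""
    X = np.asarray(X, dtype=float)
    resid = np.asarray(y, dtype=float).copy()  # residual y - X b, with b = 0
    beta = np.zeros(X.shape[1])
    col_sq = np.sum(X ** 2, axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual
            rho = X[:, j] @ resid + col_sq[j] * beta[j]
            # Soft-thresholding update
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta

def lmm_lasso_sketch(X, y, K, delta, lam):
    """Lasso on data rotated by the (hypothetical) kinship-based whitening."""
    W = whiten(K, delta)
    return lasso_cd(W @ X, W @ y, lam)
```

Here `K` stands for a sample-by-sample relatedness matrix that captures population structure; markers with nonzero coefficients after the rotation are the candidates selected by the sparse model.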

Publikationsserver der Universität Tübingen

MDC Repository

MPG.PuRe

It is all in the noise: efficient multi-task Gaussian process inference with structured residuals

Author: Borgwardt K.
Lippert C.
Rakitsch B.
Stegle O.
Publication date: 01/12/2013

MDC Repository

ccSVM: correcting Support Vector Machines for confounding factors in biological data classification

Author: Atwell
B. Rakitsch
Berry
Borgwardt
Bullinger
Holsboer
K. Borgwardt
Kang
L. Li
Marchini
Noble
Palma
Price
To
Valk
Warnat
Publication venue: Oxford University Press
Publication date: 01/01/2011

Motivation: Classifying biological data into different groups is a central task of bioinformatics: for instance, to predict the function of a gene or protein, the disease state of a patient or the phenotype of an individual based on its genotype. Support Vector Machines are a widespread approach for classifying biological data, due to their high accuracy, their ability to deal with structured data such as strings, and the ease of integrating various types of data. However, it is unclear how to correct for confounding factors such as population structure, age, gender or experimental conditions in Support Vector Machine classification.

Repository for Publications and Research Data

Crossref

PubMed Central

MPG.PuRe

Genetic architecture of nonadditive inheritance in Arabidopsis thaliana hybrids

Author: Borgwardt K.
Chae E.
Grimm D.
Habring-Müller A.
Koenig D.
Martin Pizarro C.
Rakitsch B.
Seymour D.
Vasseur F.
Weigel D.
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2016

The ubiquity of nonparental hybrid phenotypes, such as hybrid vigor and hybrid inferiority, has interested biologists for over a century and is of considerable agricultural importance. Although examples of both phenomena have been subject to intense investigation, no general model for the molecular basis of nonadditive genetic variance has emerged, and prediction of hybrid phenotypes from parental information continues to be a challenge. Here we explore the genetics of hybrid phenotype in 435 Arabidopsis thaliana individuals derived from intercrosses of 30 parents in a half diallel mating scheme. We find that nonadditive genetic effects are a major component of genetic variation in this population and that the genetic basis of hybrid phenotype can be mapped using genome-wide association (GWA) techniques. Significant loci together can explain as much as 20% of phenotypic variation in the surveyed population and include examples that have both classical dominant and overdominant effects. One candidate region inherited dominantly in the half diallel contains the gene for the MADS-box transcription factor AGAMOUS-LIKE 50 (AGL50), which we show directly to alter flowering time in the predicted manner. Our study not only illustrates the promise of GWA approaches to dissect the genetic architecture underpinning hybrid performance but also demonstrates the contribution of classical dominance to genetic variance

ZENODO

Dryad Digital Repository (Duke University)

PubMed Central

Publikationsserver der Universität Tübingen

eScholarship - University of California

Electronic Archiving System

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

MPG.PuRe

Genomic Profiles of Diversification and Genotype–Phenotype Association in Island Nematode Lineages

Author: Borgwardt K.
Grimm D.
Leaver M.
McGaughran A.
Meyer J.
Moreno E.
Morgan K.
Rakitsch B.
Rödelsperger C.
Serobyan V.
Sommer R.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/09/2016

Understanding how new species form requires investigation of evolutionary forces that cause phenotypic and genotypic changes among populations. However, the mechanisms underlying speciation vary and little is known about whether genomes diversify in the same ways in parallel at the incipient scale. We address this using the nematode, Pristionchus pacificus, which resides at an interesting point on the speciation continuum (distinct evolutionary lineages without reproductive isolation), and inhabits heterogeneous environments subject to divergent environmental pressures. Using whole genome re-sequencing of 264 strains, we estimate FST to identify outlier regions of extraordinary differentiation (∼1.725 Mb of the 172.5 Mb genome). We find evidence for shared divergent genomic regions occurring at a higher frequency than expected by chance among populations of the same evolutionary lineage. We use allele frequency spectra to find that, among lineages, 53% of divergent regions are consistent with adaptive selection, whereas 24% and 23% of such regions suggest background selection and restricted gene flow, respectively. In contrast, among populations from the same lineage, similar proportions (34-48%) of divergent regions correspond to adaptive selection and restricted gene flow, whereas 13-22% suggest background selection. Because speciation often involves phenotypic and genomic divergence, we also evaluate phenotypic variation, focusing on pH tolerance, which we find is diverging in a manner corresponding to environmental differences among populations. Taking a genome-wide association approach, we functionally validate a significant genotype-phenotype association for this trait. Our results are consistent with P. pacificus undergoing heterogeneous genotypic and phenotypic diversification related to both evolutionary and environmental processes
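As a rough illustration of an FST outlier scan, the sketch below computes a simple heterozygosity-based FST for biallelic sites in two populations and flags the most differentiated windows. The abstract does not state which estimator or threshold the study used, so the formula (H_T - H_S) / H_T, the function names, and the default top 1% cutoff (chosen here because 1.725 Mb is 1% of 172.5 Mb) are assumptions for illustration.

```python
import numpy as np

def fst_two_pops(p1, p2):
    """Per-site FST = (H_T - H_S) / H_T from allele frequencies in two populations."""
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    p_bar = (p1 + p2) / 2.0
    # Mean within-population expected heterozygosity
    h_s = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2.0
    # Expected heterozygosity of the pooled population
    h_t = 2 * p_bar * (1 - p_bar)
    with np.errstate(divide="ignore", invalid="ignore"):
        # Sites that are monomorphic overall carry no signal; report 0 there
        return np.where(h_t > 0, (h_t - h_s) / h_t, 0.0)

def outlier_windows(fst_per_window, top_fraction=0.01):
    """Indices of windows whose FST is at or above the (1 - top_fraction) quantile."""
    fst_per_window = np.asarray(fst_per_window, dtype=float)
    cutoff = np.quantile(fst_per_window, 1.0 - top_fraction)
    return np.flatnonzero(fst_per_window >= cutoff)
```

This estimator weights both populations equally and ignores sample-size corrections, so it is only meant to convey the shape of the computation.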

MPG.PuRe

Does the inclusion of rare variants improve risk prediction?

Author: A Hoerl
B Rakitsch
D Levy
Erin Austin
H Kang
H Zou
J Fan
J Friedman
P Breheny
S Yang
Wei Pan
X Shen
Xiaotong Shen
Publication venue: 'Springer Science and Business Media LLC'

Crossref